FARMS: Efficient mapreduce speculation for failure recovery in short jobs
Authors
Abstract
Similar resources
FARMS: Efficient mapreduce speculation for failure recovery in short jobs
With the ever-increasing size of software and hardware components and the complexity of configurations, large-scale analytics systems face the challenge of frequent transient faults and permanent failures. As an indispensable part of big data analytics, MapReduce is equipped with a speculation mechanism to cope with run-time stragglers and failures. However, we reveal that the existing speculat...
Energy Efficient Scheduling of MapReduce Jobs
MapReduce has emerged as a prominent programming model for data-intensive computation. In this work, we study power-aware MapReduce scheduling in the speed scaling setting first introduced by Yao et al. [FOCS 1995]. We focus on the minimization of the total weighted completion time of a set of MapReduce jobs under a given budget of energy. Using a linear programming relaxation of our problem, we...
An Efficient Solution for Processing Skewed MapReduce Jobs
Although MapReduce has been praised for its high scalability and fault tolerance, it has been criticized on some points, in particular its poor performance in the case of data skew. There are important cases where a high percentage of the processing on the reduce side is done by a few nodes, or even one node, while the others remain idle. There have been some attempts to address the problem of dat...
FP-Hadoop: Efficient processing of skewed MapReduce jobs
Nowadays, we are witnessing the fast production of very large amounts of data, particularly by the users of online systems on the Web. However, processing this big data is very challenging since both space and computational requirements are hard to satisfy. One solution for dealing with such requirements is to take advantage of parallel frameworks, such as MapReduce or Spark, that allow to make ...
Online Aggregation for Large MapReduce Jobs
In online aggregation, a database system processes a user’s aggregation query in an online fashion. At all times during processing, the system gives the user an estimate of the final query result, with confidence bounds that become tighter over time. In this paper, we consider how online aggregation can be built into a MapReduce system for large-scale data processing. Given the MapReduce pa...
Journal
Journal title: Parallel Computing
Year: 2017
ISSN: 0167-8191
DOI: 10.1016/j.parco.2016.10.004